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Abstract 

Parameter estimation in two-dimensional diffusion models with only one coordinate observed is 
highly relevant in many biological applications, but a statistically difficult problem. In neuroscience, 
the membrane potential evolution in single neurons can be measured at high frequency, but biophys- 
ical realistic models have to include the unobserved dynamics of ion chaimels. One such model is 
the stochastic Morris-Lecar model, where random fluctuations in conductance and synaptic input are 
specifically accounted for by the diffusion terms. It is defined through a non-linear two-dimensional 
stochastic differential equation with state dependent noise on the non-observed coordinate. The co- 
ordinates are coupled, i.e. the unobserved coordinate is non-autonomous, and we are therefore not in 
the more well-behaved situation of a hidden Markov model. In this paper, we propose a sequential 
Monte Carlo particle filter algorithm to impute the unobserved coordinate, and then estimate param- 
eters maximizing a pseudo-likelihood through a stochastic version of the Expectation-Maximization 
algorithm. The performance is evaluated in a simulation study, and it turns out that even the rate scal- 
ing parameter governing the opening and closing of ion channels of the unobserved coordinate can 
be reasonably estimated. Also an experimental data set of intracellular recordings of the membrane 
potential of a spinal motoneuron of a red-eared turtle is analyzed. 

Keywords: Sequential Monte Carlo; diffusions; pseudo likelihood; Stochastic Approximation Expecta- 
tion Maximization; motoneurons; conductance-based neuron models; membrane potential 

1 Introduction 

In neuroscience, it is of major interest to understand the principles of information processing in the 
nervous system, and a basic step is to understand signal processing and transmission in single neurons. 
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Therefore, there is a growing demand for robust methods to estimate biophysical relevant parameters 
from partially observed detailed models. Statistical inference from experimental data in biophysically 
detailed models of single neurons is difficult. Data are typically of either of two types: extracellular 
recordings, where only the spike times are observed, i.e. a point process, or intracellular recordings, 
where the membrane potential is recorded at high frequency, e.g. observations are taken every 0.1 ms. 
The data are naturally described by single point models, neglecting morphological properties of the 
neuron, since no spatial information is available in the data, and all neuronal properties are collapsed 
into a single point in space. Even so, the models can be multi-dimensional and have far more parameters 
than the data can provide information about. Often these models are compared to experimental data by 
hand-tuning to reproduce the qualitative behaviors observed in experimental data, but without any formal 
statistical analysis. 

The complexity of neuronal single cell models ranges from detailed biophysical descriptions, exempli- 
fied most prominently in the Hodgkin-Huxley model, to simplified one-dimensional diffusion models. 



The Hodgkin-Huxley model ( Hodgkin and Huxley ( 1952 1) is a conductance-based model, which consists 



of four coupled differential equations, one for the membrane voltage, and three equations describing the 
gating variables that model the voltage-dependent sodium and potassium channels. Conductance-based 
models provide a minimal biophysical interpretation for an excitable cell, including the voltage depen- 
dence or time-dependent nature of the conductance due to movement of ions across ion channels in the 
membrane. Several simplifications of the Hodgkin-Huxley model has been proposed, most commonly in 
order to reduce the four-dimensional system to a two-dimensional system, using the time scale separa- 
tions by treating the fast variables as instantaneous variables by a quasi steady-state approximation, and 
by collapsing variables with nearly identical dynamics. 



The Morris-Lecar model (IMorris and Lecar] (|1981[)) has often been used as a good, qualitatively quite 



accurate, two-dimensional model of neuronal spiking. It is a conductance-based model like the Hodgkin- 
Huxley model, introduced to explain the dynamics of the barnacle muscle fiber. It can be considered a 
prototype for a wide variety of neurons. The model is given by two coupled first order differential 
equations, the first modeling the membrane potential evolution, and the second equation modeling the 
activation of potassium current. If both current and conductance noise should be taken into account, 
the stochastic Morris-Lecar model arises, where diffusion terms have been added on both coordinates, 
governed by Wiener processes. Typically, the membrane potential will be measured discretely at high 
frequency, whereas the second variable cannot be observed. Our goal is to estimate parameters of the 
model from discrete observations of the first coordinate. This includes estimation of a central rate param- 
eter characterizing the channel kinetics of the unobserved component, and the two diffusion parameters 
representing the amplitude of the noise. 



In Huys et al. (2006 1 up to 10 parameters are estimated in a detailed multi-compartmental single neuron 
model. Only parameters entering linearly in the loss function are considered, and channel kinetics are 
assumed known. It is a quadratic optimization problem solved by least squares, and shown to work well 
for low noise and high frequency sampling. When either the discretization step or the noise increase, a 
bias is introduced. In Huys and Paninski ( |2009| ) they extend the estimation to allow for measurement 



noise, first smoothing the data by a particle filter, and then maximizing the likehhood through a Monte 
Carlo EM-algorithm. 



In this paper, we propose to approximate the non-linear two-dimensional model through an Euler- 
Maruyama scheme to obtain a tractable pseudo-likelihood. Then we consider the statistical model as 
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an incomplete data model, where the unobserved part is imputed by a Sequential Monte Carlo (SMC) 
algorithm. This is not straightforward for the type of conductance based model, we are studying. It is 
a multi-dimensional coupled stochastic differential equation (SDE), where only one coordinate is ob- 
served. This means that the unobserved coordinate is non-autonomous, and the model does not fit into 
the hidden Markov model framework. Furthermore, the diffusion is not time reversible. Thus, we cannot 



directly apply the SMC algorithm proposed in Del Moral et al. (2001 1, which assumes the non-observed 



data autonomous, nor the algorithms proposed in Feamhead et al. (2008), which assumes the drift to 



be of gradient type. In the ergodic case this corresponds to a time reversible diffusion. In the parti- 
cle filter proposed in our paper, the observed coordinate is not re-simulated. Finally, we maximize the 
pseudo-likelihood, based on an Euler-Maruyama approximation of the SDE defining the model, through 
a Stochastic Approximation Expectation-Maximization (SAEM) algorithm, where the unobserved data 
are imputed at each iteration of the algorithm. We prove that the estimator obtained from this combined 
SAEM-SMC algorithm converges with probability one to a local maximum of the pseudo-likelihood. 
We also prove that the pseudo-likelihood converges to the true likelihood as the time step between ob- 
servations go to zero. 

The paper is organized as follows: In Section 2 the model is presented, the special noise structure, we are 
considering, is motivated, and the pseudo likelihood arising from the Euler-Maruyama approximation is 
found. In Section 3 we present the proposed estimation procedure and the assumptions needed for the 
convergence results to hold. In Section 4 we conduct a simulation study to document the performance of 
the method, and in Section 5 we apply the method on an experimental data set of intracellular recordings 
of the membrane potential of a motoneuron of an adult turtle. Proofs and technical results can be found 
in the Appendix. 



2 Stochastic Morris-Lecar model 
2.1 Exact diffusion model 

We consider a stochastic Morris-Lecar model including both current and channel noise, defined as 

' dVt = f{Vt,Ut)dt + jdBt, 
dUt = b{Vt,Ut)dt +a{Vt,Ut)dBt, 

where 

f{Vt,Ut) = ^{-gcamM){Vt-VcJ-gKUt{Vt-VK)-gL{Vt-VL) + I), 
b{Vt,Ut) = {a{Vt){l - Ut) - p{Vt)Ut) , 

rriooyv) = - 1 + tann 



2 V V ^2 

a{v) = 2'^cosh('9r) + 
/^(^) = -(f) cosh. ( ) ( 1 — tanh ^ 



and the initial condition (Vq, Uq) is random with density p(Vo, Uq). Processes {Bt)t>to and {Bt)t>to 
are independent Brownian motions. The variable Vt represents the membrane potential of the neuron at 



Parameter estimation in neuronal models with particle filter methods 



4 



o 



d 



OJ 

d 



membrane voltage, V(t) 





normalized conductance, U(t) 





in 
d 



d 



C\i 

d 



state space 




500 
time 



1000 



-40 -20 
V, 



20 40 



Figure 1: Simulated trajectory of the stochastic Morris-Lecar model: (V^) as a function of time (left, 
top), {Ut) as a function of time (left, bottom), and {Ut) against (V*) (right). Parameters are given in 
Section [4] Time is measured in ms, voltage in mV, the conductance is normalized between and 1. 



time t, and Ut represents the normalized conductance of the K+ current. This is a variable between 
and 1 , and could be interpreted as the probability that a K"*" ion channel is open at time t. The equation 
for /(•) describing the dynamics of Vt contains four terms, corresponding to Ca^+ current, K+ current, 
a general leak current, and the input current I. The functions «(•) and /?(•) model the rates of open- 
ing and closing, respectively, of the K+ ion channels. The function m,oo(-) represents the equilibrium 
value of the normalized Ca^+ conductance for a given value of the membrane potential. The parameters 
^1,^2, ^3 and V4 are scaling parameters; gca,gK and are conductances associated with Ca^+, K+ 
and leak currents, respectively; Vca, Vk and Vl are reversal potentials for Ca^"*", K"*" and leak currents, 
respectively; C is the membrane capacitance; is a rate scaling parameter for the opening and closing 
of the K+ ion channels; and / is the input current. 

Parameter 7 scales the current noise. Function a{Vt, Ut) models the channel or conductance noise. We 



consider the following function that ensures that Ut stays bounded in the unit interval if o" < 1 (Ditlevsen 



and Greenwood 2012 1 



a{Vt,Ut) = a^2^^^^^^^^^^^Ut{l-Ut). 

A trajectory of the model is simulated in Fig. [T] The peaks of (V^) corresponds to spikes of the neuron. 

Data consist in discrete measurements of (Vt) while {Ut) is not measured. We denote < ti < . . . < tn 
the discrete observation times. We denote Vi = VJ. the observation at time ti and Vb:n = (^o' ■ • • ' ^n) 
the vector of all the observed data. We focus on the estimation of parameters from observations Vo:n- 
Let ^ € B C RP be the vector of parameters to be estimated. We will consider two cases: Estimation of 
all identifiable parameters of the observed coordinate, where all parameters of the unobserved channel 
dynamics are assumed known except for the volatility parameter, 9 = {gca, gK,gL, Vca, Vk, 1, 7, cr); 
and estimation of 6 = {gca, gK,gL, Vca, Vk, 1, 7, (/>), where also the rate parameter of the unob- 
served coordinate is estimated. Note that C is a proportionality factor of the conductance parameters 
and thus unidentifiable, as well as the constant level in /(•) is given by g^VL + I, and thus Vl (or /) is 
unidentifiable. An ideal estimator for 9 is the maximum likelihood estimator, obtained by maximizing 
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the likelihood of Vq-u- However, this likelihood is intractable, as explained below. 

The likelihood function p{VQ:n] 0) of model ([T]l has a complex form for several reasons. Let us first write 
the likelihood function of the ideal case where the second coordinate {Ut) is also discretely observed. Let 
Uq-u denote a realization of {Ut) at observation times to,. . . , tn- Since the vector (Vi, Ui) is Markovian, 
the likelihood p(Vo;n, ^0:n; G) can be written as a product of conditional densities 

n 

p{Vo:n, Uo-.n, 0) = p{Vo, Uq; 6) ^p{Vi, Ui\V,.i, Ui-i; 6), 

i=l 

where p{Vi, Ui\Vi-i, Ui-i; 9) is the transition density of {Vi, Ui) given {Vi-i, Ui-i). Unfortunately, the 
density p{Vi, Ui\Vi-i, U-i; 9) has no explicit form because the diffusion is highly non-linear. Therefore, 
even when {Ut) is discretely observed at the same time points as (V^), the likelihood is not explicit and 
the maximum likelihood estimator is not available. In this ideal case, minimum contrast estimators based 



on the Euler-Maruyama approximation of the diffusion have been proposed by Kessler ( 1997 1. 



When the second coordinate is not observed, the estimation is much more difficult. Indeed, although 
{Vt, Ut) is Markovian, the single coordinate {Vt) is not Markovian. The process {Ut) is a latent or hidden 
variable which has to be integrated out to compute the likelihood 

p{Vo:n; 9)= I ... I p(Vb, f/o; 9) f{p{Vi, UiW.i, Ui.i;9)dUo . . . dU^. (2) 

Again, the transition density p{Vi, Ui\Vi-i, Ui-i; 9) is generally not available and has to be approxi- 
mated. We therefore introduce an approximate diffusion based on the Euler-Maruyama scheme. 



2.2 Approximate diffusion 

Let A denote the step size between two observation times. Here we assume that A does not depend 
on i. The extension to unequally spaced observation times is straightforward. The Euler-Maruyama 
approximation of model ([T]) leads to a discretized model defined as follows 

Vi+i = Vi + Af{Vi,Ui) + ^jVi, (3) 
Ui+i = Ui + Ab{Vi,Ui) + VAa{Vi,U)r,i, 

where {fji) and {rji) are independent centered Gaussian variables. To ease readability the same notation 
{Vi,Ui) is used for the original and the approximated processes. This should not lead to confusion, 
as long as the transition densities are distinguished, as done below. The likelihood of the approximate 
model is equal to 



PA{Vo:n;9) = j ... j p{Vo,Uo;9)f\p^{Vi,Ui\Vi.uU-i;9)dUo . . .dUn, 

i=l 

where p/\{Vi, Ui\Vi-i, U-i; 9) is the transition density of model (j3]l: 

a/ rriT/ u m 1 f {Vi - Vj-i - Af{Vi^,,U^-l))^ 
PA{Vi,Ui\Vi^i,Ui-i;9) = exp ( ^^-^ 



(4) 



V27rA7 V 2A7 

1 f {U-Ui.i- Ab{V,.i,Ui.i))^' 

exp ' 



V27fAa(yi_i,C/i_i) V 2Aa'^{Vi-i,Ui-i) 
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We aim at estimating the maximum of the likehhood (j4]) of the approximate model, which corresponds 
to a pseudo-likelihood for the exact diffusion. 



3 Estimation method 

The multiple integrals of equation Q are difficult to handle. Thus, it is not possible to maximize the 
likelihood directly. 

A solution is to consider the statistical model as an incomplete data model. The observable vector Vb:n is 
then part of a so-called complete vector {Vq-u, Uq-^u), where C/o:n has to be imputed. Simulation under the 
smoothing distribution pA(f^O:n I Vb:n; (^)dUQ-n is likely to be a complex problem, and direct simulation of 
the non-observed data (C/o:^) is not possible. A Sequential Monte Carlo (SMC) algorithm, also known 



as Particle Filtering, provides a way to approximate this distribution (Doucet et al. 2001 1. We have 
adapted this algorithm to handle a coupled two-dimensional SDE, i.e. the unobserved coordinate is 
non-autonomous. Thus, we are not in the more well-behaved situation of a hidden Markov model. 

To maximize the likelihood of the complete data vector (Vbm, U^-n), we propose to use a stochastic 



version of the Expectation-Maximization (EM) algorithm, namely the SAEM algorithm (Delyon et al. 



1999). Thus, we combine the SAEM algorithm with the SMC algorithm, where the unobserved data are 



filtered at each iteration step, to estimate the parameters of model ([3]). Details on the SMC are given in 



Section 3.1 and the SAEM algorithm is presented in Section [3^21 Convergence of this new SAEM-SMC 



algorithm is stated in Section 3.3 



3.1 SMC, a particle filter algorithm 

In this section, the aim is to approximate the distribution PA(C^O:n|^0:n; G)dUQ-n, for a fixed value of 9. 
When included in the stochastic EM algorithm, this value will be the current value 9m of the parameter 
at the given iteration. The SMC algorithm provides a set of K particles (C/o^2)a:=1- -^ weights 
{^lyl)k=i...K approximating the conditional distribution pA(f^O:n I Vbm; (^)dUo-n (see 



Doucet et al. 



2001 



for a complete review). The SMC method relies on proposal distributions q{Ui\Vi, Vi-i, f7j_i; 9)dUi to 
sample what we call particles from these distributions. We write Vb:i = (Vq, ■ ■ ■ ,Vi) and likewise for 

Algorithm 1 (SMC algorithm) 

• At time i = 0: ^ k = 1, . . . ,K 

1. sample Uq''^ from p{Uo\Vo; 6) 



2. compute and normalize the weights: 
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At time i = 1, . . . , n: \lk = l,...,K 

1. resample the particles, i.e. sample the indices Al_^from a multinomial distribution such that 

p{Af\ = i) = w,{ulll^), yi = i,...,K 

and put t/o-j-i = '-'o-.i-i ■ 

2. sample C/f ^ ~ q (-1^,, f/^i?; ^) and set = {u'^f_^, f/f ^) 



3. compute and normalize the weights: 



'0:i 



Finally, the SMC algorithm provides an empirical measure 



k=l 



which is an approximation to the smoothing distribution PA(t^O;n|^0;n; 0)dUo;n- Here, 1^ is the Dirac 
measure in x. A draw from this distribution can be obtained by sampling an index k from a multinomial 

(k) 

distribution with probabilities Wn and setting the draw Uo-n equal to Uo-n = Uq.^. 

As in any importance sampling method, the choice of the proposal distribution q is crucial to ensure good 
convergence properties. The first classical choice of q is q{Ui\Vi, Vi-i, Ui-i;9) = PA{Ui\Vi-i, Ui-i;9), 
i.e. the transition density. In this case, the weight reduces to Wi ( f^n^M = PAiyi\uj^''\ Vi-i, U^'^^ ■ 



A second choice for the proposal is q{Ui\Vi, Vi-i, Ui-i\6) = p/\{Ui\Vi, Vi-i, 0), i.e. the condi 



tional distribution. In this case, the weight reduces to Wi \ Uq.^\ = PA(^i| C^o.j_i; 0). Transition 
densities and conditional distributions are detailed in Appendix |A] for the approximate model Q. When 
the two Brownian motions are independent, as we assume, the two choices are equivalent. 



We can compare this particle filter with previous filters proposed in the literature. Del Moral et al. (2001 1 
propose a particle filter for a two-dimensional SDE, where the second equation is autonomous. Although 
the first coordinate is observed at discrete times, [Del Moral et a/.| ( [200T l propose to simulate the Vq 



at each iteration of the filter with the proposal p/:\{Vi, Ui\Vi-i, f/j-i). The weights are computed with 

(k) 

a bounded function of the difference between the observed value Vi and the simulated particles V-^ . 



Feamhead et al. ( 2008 1 generalise this particle filter with any proposal distribution to simulate the Vi at 



each iteration. The weights then reduce to a Dirac mass of the difference between the observed value Vi 
and the simulated particles V^^"* , which is likely to be almost surely equal to zero. To avoid this problem 
of zero weights, the SMC algorithm proposed here is slightly different as Vb:n is not re-simulated. 



The convergence of Algorithm [T] is studied in Appendix [Pj 
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3.2 SAEM algorithm 



The EM algorithm ( Dempster et al.\\\911 1 is useful in situations where the direct maximization of the 



marginal likelihood PA(Vb:n ; G) is more difficult than the maximization of the conditional expec- 
tation of the complete likelihood 

Q{6\e')=¥.^ [logPA(^O:n,f/O:n;0)|l^O:n;^'] , 

where pA(Vb:n , C/o;n; 0) is the likelihood of the complete data (Vom; C^O;n) of the approximate model ^ 
and the expectation is under the conditional distribution of {7o;n given Vo;n with density pA(f^O;n I Vb:n; G'). 
The EM algorithm is an iterative procedure: at the mth iteration, given the current value ^m-i of the 
parameter, the E-step is the evaluation of Qm(^) = Q{G \ 6m~i), while the M-step updates Om-i by 
maximizing Qm{(^)- To fulfill convergence conditions of the algorithm, we consider the particular case 
of a distribution from an exponential family. More precisely, we assume: 

(Ml) The parameter space is an open subset of M^. The complete likelihood pA(Vb:n, f^om; belongs 
to a curved exponential family, i.e. 

where Tp and v are two functions of 9, S{Vo-n, Uo-n) is known as the minimal sufficient statistic of 
the complete model, taking its value in a subset S of M*^, and (•, •) is the scalar product on M*^. 

The approximate Morris-Lecar model Q satisfies this assumption when we assume the scaling parame- 
ters Vi, V2, V3, V4 known. Details of the sufficient statistic S are given in Appendix IB] 



Under assumption (Ml), the E-step reduces to the computation of Ea 



this expectation has no closed form, Delyon et al. (1999]) propose the Stochastic Approximation EM 



SiVo:n,Uo:n)\Vo ■.rii ^m— 



. When 



algorithm (SAEM) replacing the E-step by a stochastic approximation of Qm{G)- The E-step is then 
divided into a simulation step (S-step) of the non-observed data (C^q.'^'') with the conditional density 
PAiUo:n \Vo:n] dm-i) and a stochastic approximation step (SA-step) of Ea S{Vo:n, Uo:n)\Vo:n] 6m-i ■ 
We write Sm for the approximation of this expectation. Iterations of the SAEM algorithm are written as 
follows: 

Algorithm 2 (SAEM algorithm) 

• Iteration 0: initialization of 6q and set sq = 0. 

• Iteration m> 1: 

S-Step: simulation of the non-observed data {U^"^) conditionally on the observed data from the 
distribution PAiUo-.nlVo-.n, Om~i)dUo;n. 
SA-Step: update Sm-i using the stochastic approximation scheme: 



-1 + dm-l 



(5) 



where {am)meN sequence of positive numbers decreasing to zero. 
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M-Step: update of 9m by: 

9m = argmax(-V'(6') + {sm,i^{9))) . 



At the S-step, the simulation under the smoothing distribution p^iUo-.n |Vo:n; 9m~i)dUo-n is done by 



SMC, as explained in Section 3.1 We call this algorithm the SAEM-SMC algorithm. The standard 
errors of the parameter estimators can be evaluated through the Fisher information matrix. Details are 
given in Appendix [C| 

An advantage of the SAEM algorithm is the low-level dependence on the initialization 9o, due to the 
stochastic approximation of the E-step. The other advantage of the SAEM algorithm over a Monte-Carlo 
EM algorithm is the computational time. Indeed, only one simulation of the hidden variables C/o;„ is 
needed in the simulation step while an increasing number of simulated hidden variables is required in a 
Monte-Carlo EM algorithm. 



3.3 Convergence of the SAEM-SMC algorithm 

The SAEM algorithm we propose in this paper is based on an approximate simulation step performed 
with an SMC algorithm. We prove that even if this simulation is not exact, the SAEM algorithm still 
converges towards the maximum of the likelihood of the approximated diffusion Q. This is true because 
the SMC algorithm has good convergence properties. 

Let us be more precise. We introduce a set of convergence assumptions which are the classic ones for 



the SAEM algorithm (Delyon et al. 1999 1. 



(M2) The functions iIj{9) and u{9) are twice continuously differentiable on B. 

(M3) The function s : @ — > S defined by s{9) = f S{v, u)pA.{u\v; 9)dv du is continuously differen- 
tiable on 0. 

(M4) The function ^a(^) = ^ogp^iv, u, 9) is continuously differentiable on 6 and 

de I PA{v,u;9)dv du = / depA{v,u;9)dv du. 



(MS) Define L : S x & — ^ M by L(s, 9) = -ip{e) + {s, u{9)). There exists a function 9 : S — ^6 
such that 

G e, Vs G S, L{s, 9{s)) > L{s, 9). 

(SAEMl) The positive decreasing sequence of the stochastic approximation {am)m>i is such that = 

oo and Y.rn «m < oo- 

(SAEM2) : — ^ IK and ^ : 5 — )• G are d times differentiable, where d is the dimension of S{v, u). 

(SAEM3) For all G 6, j \\S{v,u)\\'^ pA{u\v;e)du < oo and the function r(0) = Cov0{S{-,Uo..n)) is 
continuous, where the covariance is under the conditional distribution PA(f^O:n| Vo:n; 9). 
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(SAEM4) For any positive Borel function / 



where {Tm} is the increasing family of fi-algebras generated by the random variables sq, Uq^^, 

^0:n' • • • ' ^0:n ■ 

Assumptions (M1)-(M5) ensure the convergence of the EM algorithm when the E-step is exact ( [Delyon 



et a/.[|1999) . Assumptions (M1)-(M5) and (SAEM1)-(SAEM4) together with the additional assumption 



that (sm)m>o takes its values in a compact subset of S ensure the convergence of the SAEM estimates 



to a stationary point of the observed likelihood pAiVo-.n] ^) when the simulation step is exact (Delyon 



etal. 19991. 



Here the simulation step is not exact and we have three additional assumptions on the SMC algorithm to 
bound the error induced by this algorithm and prove the convergence of the SAEM-SMC algorithm. 

(SMCl) The number of particles K used at each iteration of the SAEM algorithm varies along the iteration: 
there exists a function g{m) — )• oo when m — )• oo such that K{m) > g{m) log(m). 

(SMC2) The function S is bounded uniformly in u. 

(SMC3) The functions p^iViWi, K-i, t^j-i; ^) are bounded uniformly in 9. 

Theorem 1. Assume that (Ml )-(M5), (SAEMl )-(SAEM3), and (SMCl )-(SMC3) hold Then, with proba- 
bility 1, limm->oo d{6m^ £) = where £ = € G, deiAid) = 0} is the set of stationary points of the 
log-likelihood £^(0) = ^ogpAiVo-.n', S). 



Theorem [T] is proved in Appendix [Pj Note that assumption (SAEM4) is not needed thanks to the condi- 
tional independence of the particles generated by the SMC algorithm, as detailed in the proof. Similarly, 
the additional assumption that {sm)m>o takes its values in a compact subset of S is not needed, as it is 
directly satisfied under assumption (SMC2). 

We deduce that the SAEM algorithm converges to a (local) maximum of the likelihood under standard 



additional assumptions (L0C1)-(L0C3) proposed byjDelyon et al. (19991 on the regularity of the log- 
likelihood £A{yo:n', ^) that we do not recall here. 

Corollary 1. Under the assumptions of Theorem^and additional assumptions (L0C1)-(L0C3), the 
sequence 9m converges with probability 1 to a (local) maximum of the likelihood pAiVo-.n'-, 



Proof is given in Appendix [Pj 

In practice, the SAEM algorithm is implemented with a fixed number of particles, although an increasing 
number is needed to prove the theoretical convergence. Assumption (SMCl) provides the order of mag- 
nitude of the number of particles needed to obtain satisfactory results for a given number of iterations of 
the SAEM algorithm. 

Assumptions (M1)-(M5) are classically satisfied. Assumption (SAEMl) is easily satisfied by choosing 
properly the sequence (om). Assumptions (SAEM2) and (SAEM3) depend on the regularity of the 
model. They are satisfied for the approximate Morris-Lecar model. 
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Assumption (SMC2) is satisfied for the approximate Morris-Lecar model because the variables U are 
bounded between and 1 and the variables V are fixed at their observed values. This would not have 



been the case with the filter of D el Moral et al. | ( |2C)01| ), which resimulates the variables V at each iteration. 
Assumption (SMC3) is satisfied if we require that 7 is strictly bounded away from zero; 7 > e > 0. 



3.4 Properties of the approximate diffusion 

The SAEM-SMC algorithm provides a sequence which converges to the set of stationary points of the 
log-likelihood l/\{6) = logPA(^0:n; The following result aims at comparing this likelihood, which 
corresponds to the Euler approximate model ([3]), with the true likelihood p(Vo;n; ^) in 

The result is based on the bound of the Euler approximation which has been proved by |Bally and Talay 
( [r996a| . Their result holds under the following assumption 

(HI) Functions /, 6, cr are 2 times differentiable with bounded derivatives with respect to u and v of all 
orders up to 2. 

Let us assume we apply the SAEM algorithm on an approximate model obtained with an Euler scheme 
of step size 6 = A/L. Then we have the following result 

Theorem 2. Under assumption ( HI ), there exists a constant C, independent of 9, such that for any 
9 £ Q, and any vector Vo:n 

\piVo:n;9) - ps{Vo:n;9)\ < C^nA 

Proof is given in Appendix [P} 

Assumption (HI) does not hold for the Morris-Lecar model, but if the state variables were bounded, 
assumption (HI) would hold. This is the case for Ut, whereas Vt is not. Nevertheless, the tails of Vt 
behave as an Omstein-Uhlenbeck process, and could be truncated at an arbitrary large value, and the 
behavior would be approximately the same. 



4 Simulation study 

Parameter values of the Morris-Lecar model used in the simulations are the same as those of ITatenol 



and Pakdaman ( |2004 1 for a class II membrane, except that we set the membrane capacitance constant to 



C = 1 /iF/cm , which is the standard value reported in the literature. Conductances and input current 
were correspondingly changed, and thus, the two models are the same. The values are: Vk = —84 
mV, Vl = -60 mV, Vca = 120 mV, C = l^F/cm^, ql = 0.1 fiS/cm^, gca = 0.22/iS/cm2, qk = 
0.4/iS/cm2, Vl = -1.2 mV, V2 = 18 mV, V3 = 2 mV, V4 = 30 mV, (j) = 0.04 ms"^ Input is chosen 
to be I = 4.5 ijlAJcw?. Initial conditions of the Morris-Lecar model are Vi„ = —26 mV, Ut^ = 0.2. 
The volatility parameters are 7 = 1 mV ms~^/^, a = 0.03 ms^^/^. Trajectories are simulated with 
time step A = 0.1 ms and n = 2000 points, and either 9 = {gca,gK,gL,yca-,yK^I,l,(^) or 9 = 
{9Ca,gK,gL, Vca, Vr , 1, 7, c, (p) are estimated on each simulated trajectory. A hundred repetitions are 
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Figure 2: Filtering of {Ut) with the particle filter algorithm (200 particles): hidden simulated trajectory 
of the Morris-Lecar model {Ut) (black), mean filtered signal (grey full drawn line), 95% confidence 
interval of filtered signal (grey dashed lines). 

used to evaluate the performance of the estimators. An example of a simulated trajectory (for n = 10000) 
is given in Figure [T] 



4.1 Filtering results 

The Particle filter aims at filtering the hidden process {Ut) from the observed process (Vj). We illustrate 
its performance on a simulated trajectory, with parameter 9 fixed at its true values. The SMC Particle 
filter algorithm is implemented with K = 100 particles and the transition density as proposal. Figure |2] 
presents the result of the Particle filter procedure. The true hidden process, the mean filtered signal and 
its 95% confidence interval are plotted. The filtered process appears satisfactory. The confidence interval 
includes the true hidden process {Ut)- 



4.2 Estimation results 



The performance of the SAEM-SMC algorithm is illustrated on 100 simulated trajectories. The SAEM 
algorithm is implemented with m = 100 iterations and a sequence (a^) equal to 1 during the 30 first 
iterations and equal to = \/{m — 30)'^'^ for m > 30. The SMC algorithm is implemented with 
K = 50 particles at each iteration of the SAEM algorithm. Two types of initialization of the SAEM 
algorithm are used. The first is a random initialization of centered around the true value: = 
Gtrue + A/'(0, 1). The sccond type is a random initialization of not centered around the true 

value: = Otme + 0.1 + dtrue/3Af{0, 1). 

An example of the convergence of the SAEM algorithm for one of the iterations is presented in Fig. [3] It 
is seen that the algorithm converges for most of the parameters in very few iterations to a neighborhood 
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Figure 3: Convergence of the SAEM algorithm for the 9 estimated parameters on a simulated data set. 
True values used in the simulation are given by the gray hnes. 

of the true value, even if the initial values are far from the true ones. Only for and a more iterations are 
needed, which is expected since these two parameters appear in the second, non-observed coordinate. 

The SAEM algorithm is used to estimate either 6 = {gca, 9k, 9l, Vca, Vk, 1, 7, i'P fixed at the true 
value) or ^ = {gca, Qk^Ql, Vca, Vk, 1, 7, cr, 4>)- The SAEM estimators are compared with the pseudo 
maximum likelihood estimator obtained if both Vt and Ut were observed. Results are given in Table 
[T] The parameters are well estimated in this ideal case. When only Vt is observed, the initialization of 
the SAEM algorithm has low influence on the results. The parameters are well estimated, except the 
parameter a which is biased. The estimation of 0, which is the only parameter in the drift of the hidden 
coordinate Ut, is good and does not deteriorate the estimation of the other parameters. In Fig. |4]we 
show boxplots of the estimates of the nine parameters for the three estimation settings; both coordinates 
observed, or only one observed with either (p fixed at the true value, or also estimated. All parameters 
appear well estimated, except for a, which is only well estimated when both coordinates are observed. 
As expected, the variances of the estimators of and a hugely increase when only one coordinate is 
observed, but interestingly, the variance of the parameters of the observed coordinate do not seem much 
affected by this loss of information. 
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Parameters 








Estimator gi gca Qk cf 1 


Vk 




Vca 


/ 


true values 0.100 0.220 0.400 0.030 1.00 


-84.00 


0.040 


120.00 


4.400 


With both Vt and Ut observed (pseudo maximum likelihood estimator) 






mean 0.101 0.219 0.411 0.030 0.996 


-83.20 


0.040 


121.97 


4.539 


RMSE 0.017 0.019 0.041 0.001 0.019 


7.61 


0.001 


8.50 


0.560 


With only Vt observed (SAEM estimator) 


(f> fixed at the true value, 6q centered at the true value 










mean 0.093 0.226 0.440 0.060 1.004 


-77.01 




119.61 


3.884 


RMSE 0.016 0.022 0.067 0.031 0.017 


9.80 




10.00 


0.836 


(f) estimated, centered around the true value 










mean 0.092 0.213 0.414 0.058 1.000 


-85.72 


0.046 


123.90 


4.550 


RMSE 0.021 0.022 0.124 0.029 0.019 


10.19 


0.018 


10.90 


1.714 


estimated, 6q not centered around the true value 










mean 0.090 0.225 0.464 0.059 1.003 


-78.622 


0.041 


119.677 


4.060 


RMSE 0.021 0.024 0.144 0.029 0.017 


9.459 


0.013 


10.218 


1.028 


estimated 










SE 0.016 0.019 0.042 0.001 0.016 


4.96 


0.001 


7.31 


0.561 



Table 1: Simulation results obtained from 100 simulated Morris-Lecar trajectories (n = 2000, A = 0.1 
ms). Four estimators are compared: 1/ The pseudo maximum likelihood estimator in the ideal case where 
both Vt and Ut are observed; 21 SAEM estimator when only Vt is observed with the SAEM initialization 
at a random value centered around the true value 6, (j) fixed at the true value; 3/ SAEM estimator when 
only Vt is observed with the SAEM initialization at a random value centered around the true value 6, (/) 
estimated; 4/ SAEM estimator when only Vt is observed with the SAEM initialization at a random value 
not centered around the true value 9. An example of standard errors (SE) estimated with the SAEM-SMC 
algorithm on one single simulated dataset is also given. 
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Figure 4: Boxplots of 100 estimates from simulated data sets for the 9 parameters. True values used in 
the simulations are given by the gray lines. A: Both Vt and Ut are observed. B: Only Vt is observed, 4> is 
fixed at the true value. C: Only Vt is observed, (j) is also estimated. 
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Figure 5: Observations of the membrane potential in a spinal motoneuron of an adult red-eared turtle 
during 600 ms (upper panel), and the filtered hidden process of the normalized conductance associated 
with K+ current (lower panel) for the estimated parameters with the scaling parameters fixed at Vi = 
-2.4 mV, V2 = 36 mV, V3 = 4 mV and V4 = 60 mV. 

The SAEM-SMC algorithm provides estimates of the standard errors (SE) of the estimators (see Ap- 
pendixjC]). These should be close to the RMSE obtained from the 100 simulated datasets. As an example, 
the SE for one dataset estimated by SAEM are reported in the last line of Table [T] The estimated SE are 
satisfactory for most of the parameters, but tends to underestimate. The worst SE estimate is the one 
corresponding to a. This might be explained by the fact that this parameter is estimated with bias. 



5 Intracellular recordings from a turtle motoneuron 



The membrane potential from a spinal motoneuron in segment DIO of an adult red-eared turtle (Trache- 
mys scripta elegans) was recorded while a periodic mechanical stimulus was applied to selected regions 
of the carapace with a sampling step of 0.1 ms (for details see ,Berg et al. ( |2007[ 2008j)). The turtle 
responds to the stimulus with a reflex movement of a limb known as the scratch reflex, causing an in- 
tense synaptic input to the recorded neuron. Due to the time varying stimulus, a model for the complete 



data set needs to incorporate the time-inhomogeneity, as done in Jahn et al. (2011 1. The data can only 
be assumed stationary during short time windows, which is required for the Morris-Lecar model with 
constant parameters. Therefore, we only analyze a short trace where the input is approximately constant 
during an on-cycle (following Jahn et al. (2011 1). The analyzed data are plotted in Fig. [5j together with 
a filtered trace of the unobserved coordinate. 

First the model was fitted with the values of the scaling parameters V1-V4 as in Section |4] Most of 
the estimates are reasonable and in agreement with the expected order of magnitudes for the parameter 
values, except for the Vca reversal potential, which in the literature is reported to be around 100-150 
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Parameter 


Qt Onn Qk 


(7 ^ Vu 




Vrn 


/ 


With Vi = 


-1.2 mV, V2 = 18 mV, V3 = 


2 mV, 1/4 = 30 mV 








Estimate 


-0.962 9.555 7.280 


0.091 3.003 -89.862 


2.916 


44.705 


-2.511 


SE 


0.001 0.049 0.051 


0.000 0.001 4.684 


0.002 


0.597 


0.089 


With Vi = 


-2.4 mV, V2 = 36 mV, V3 = 


4 mV, V4 = 60 mV 








Estimate 


1.292 11.564 18.631 


0.095 2.694 -65.582 


2.683 


106.368 


-65.162 


SE 


- 0.028 0.081 


- 0.025 0.434 


0.120 


0.440 


0.506 



Table 2: Parameter estimates obtained from observations of the membrane potential of a spinal motoneu- 
ron of an adult red-eared turtle during 600 ms for two different sets of scaling parameters. 

mV (estimated to 44.7 mV), and the leak conductance, which is estimated to be negative. Conductances 
are always non-negative. This is probably due to wrong choices of the scaling constants V1-V4. For the 
parameters of the model given in Section |4j the average of the membrane potential Vt between spikes 
is around -26 mV, whereas the average of the experimental trace between spikes is around -56 mV, a 
factor two larger. We therefore rerun the estimation procedure fixing V1-V4 to twice the value from 
before, which provides approximately the same equilibrium values of the normalized Ca^+ conductance, 
moo(-)> ^rid the rates of opening and closing of K+ ion channels, a(-) and as in the theoretical 
model. Results are presented in Table [2] In this case all parameters are reasonable and in agreement with 
the expected order of magnitudes. In Fig. [6]the convergence of the SAEM algorithm is presented. As in 
the simulated data examples, it is seen that the algorithm converges for the parameters of the observed 
coordinate in very few iterations to a neighborhood of some value. Only for the parameters of the un- 
observed coordinate, (p and a, more iterations are needed. For two parameters, gi and a, the estimated 
variances were negative, but very small in absolute values. This can be due to numerical instabilities, and 
should be interpreted as being close to zero. The SEs are probably underestimated, though, as shown in 
the simulation study. 

6 Discussion 

To the authors knowledge, this is the first time the rate parameter of the unobserved coordinate, (/>, is 
estimated from experimental data. It is comforting to observe that the estimated value do not seem to 
be very sensitive to the choice of scaling parameters. Other parameters, like the conductances and the 
reversal potentials, are more sensitive to this choice, and should be interpreted with care. 

The estimation procedure builds on the pseudo likelihood, which approximates the true likelihood by 
an Euler scheme. This approximation is only valid for small sampling step, i.e. for high frequency 
data, which is the case for the type of neuronal data considered here. If data were sampled less often, a 
possibility could be to simulate diffusion bridges between the observed points, and apply the estimation 
procedure to an augmented data set consisting of the observed data and the imputed values. 

There are several issues that deserve further study. First, it is important to understand the influence 
of the scaling parameters Vi — V4, and how to estimate them for a given data set. The model is not 
exponential in these parameters (assumption (Ml)) and new estimation procedures have to be considered. 
Secondly, one should be aware of the possible misspecification of the model. More detailed models 
incorporating further types of ion channels could be explored, but increasing the model complexity might 
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Figure 6: Convergence of the SAEM algorithm for the 9 estimated parameters on the experimental data 
set consisting in observations of the membrane potential of a spinal motoneuron of an adult red-eared 
turtle during 600 ms. 
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deteriorate the estimates, since the information contained in only observing the membrane potential 
is limited. Furthermore, the sensitivity on the choice of tuning parameters of the algorithm, like the 
decreasing sequence of the stochastic approximation, (am), number of SAEM iterations, and the number 
of particles in the SMC algorithm, needs further investigation. Finally, an automated procedure to find 
starting values for the procedure is warranted. 
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Appendix 



A Distributions of approximate model 



Consider the general approximate model 



where p is the correlation coefficient between the two Brownian motions or perturbations. 
The distribution of the couple (V^+i, Ui+i) conditionally on {Vi, Ui) is 



Vi+i 



Vi. + Af{Vi,Ui 
Ui + AbiVi, U] 



ij^ + p^) p(j + aiVi,U,)) 
p{^ + a{Vi,U,)) {a^{Vi,Ui)+p^) 



The marginal distributions of Vi+i conditionally on (V^, Ui) and C/j+i conditionally on {Vi, Ui) are 

V^+i\Vi,Ui ~ M{V^ + Af{V^,Ui),A{J^+p^)) 

U+i\Vi,Ui ~ M{Ui + Ab{Vi,Ui),A{a\Vi,Ui)+p^)) 



(6) 



The conditional distributions of yj+i conditionally on {U+i, Vi,Ui) and U+i conditionally on (Vi+i, Vi, U 



are 



Vi+i\Ui+i,Vi,Ui ~ J\f {mv,Varv) 
Ui+i\Vi+i,Vi,Ui ~ M {mu,Varu) 



(7) 
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where 



T'^iVi, Ui) + 



Vary = A(, ,.iy.,U.)^p^ 

7^ + 

Y + p 

The distributions in Q and ([7]) are equal when the Brownian motions are independent, i.e. when p = 0. 

B Sufficient statistics of the approximate model (|3]) 

We detail some sufficient statistics functions. Consider the n x 6-matrix 

X = (-^0:(n-l)i -"^oo(V'o:(„_i))Fo:(n-l)i -f^O:(n-l) ^0:(n-l) i t^O:(n-l)i li "^oo ( V'o:(n-l) )) 

where 1 is the vector of I's of size n. Then the vector 

'S'l(V'o:(„_i), f/o:(n-l)) = {X' XY'^ X' {Vi;n - V'o:(n-l)) 

is the sufficient statistic vector corresponding to the parameters yi{9) = {gi, 9Ca, 9k , gRVK , 9lVl + 
Ii 9CaVca), where ' denotes transposition. 

The sufficient statistics corresponding to i'2{0) = 1/7^ are 

n n n 

1=1 1=1 i=l 

n n 

i=l 1=1 

The sufficient statistics corresponding to 1^3 (^) = 1/(7^ are 

n n 

J2 {Ui - U,.if , ^{Ui- Ui.i) (a(F,_i)(l - Ui-i)/ct> - ^{Vi^i)Ui-i/(t>) , 

i=l 

n 

J2 (a(^*-i)(i - u.i)/4> - m-iWi-i/^'f ■ 



1=1 i=l 



1=1 



The sufficient statistics corresponding to is also explicit but more complex and not detailed here. 
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C Fisher information matrix 

The standard errors (SE) of the parameter estimators can be evaluated as the diagonal elements of the 
inverse of the Fisher information matrix estimate. Its evaluation is difficult because it has no analytic 



form. We adapt the estimation of the Fisher information matrix, proposed by pelyon et al. \ and 
based on the Louis missing information principle. 

The Hessian of the log-likelihood £a(6') can be expressed as: 

OI^aW = ^[dlL{S{Vo:n,Uo:n),0)\Vo:n.e] 

+ E [deL{S{V^,n,U^..n),e) {deL{S{Vo:n,UQ:n),0))'\VQ..n,e] 

- ^[dBL{S{VQ:n,Uo:n),e)\VQ:n,e] E [aeL(5(yO;n, C/0:n) , ^) | M):n, 0]' • 

The derivatives (9eL(S'(Vo:n, UQ:n), 0) and d'^L{S{VQ-n, Uo-n), 0) are explicit for theEuler approximation 
of the Morris-Lecai" model. Therefore we implement their estimation using the stochastic approximation 
procedure of the SAEM algorithm. At the mth iteration of the algorithm, we evaluate the three following 
quantities: 



Gm+l 
Hm+l 



Hm + CLjn 



Gr, 



deL{S{Vo:nM'^^),9) 

'dlL{SiVo..n, uS),e) + deL{S{Vo:n, u!,Z^),e) ideLiS{Vo..n, uS), 6))' - H, 



m+l 



G 



m+l 



(G 



m+l , 



As the sequence {6m)m converges to the maximum of the likelihood, the sequence {Fm)m converges to 
the Fisher information matrix. 



D Proof of the convergence results 



We start by a Lemma which generalizes the result of Del Moral et al. (2001 1 to the particle filter we 
propose. Then we prove Theorem[T] Corollary [T] and Theorem[2] 



D.l Convergence results of Algorithm [T] 

Let us introduce some notation. For any bounded Borel function / : M i— )• M, we denote 7r„ e/ = 

if {Un)\Vo:n] 0), the Conditional expectation under the exact smoothing distribution pA(f^O:n I Vb:n; 
of the approximate model, and ^^g/ = '^k=i f(.^n'^)^n,e{uly^), the conditional expectation of / 
under the empirical measure obtained by the SMC algorithm for a given value of 9. 



The following lemma is an extension of the result of Del Moral et al. (2001 1 to a particle filter adapted to 
a non-autonomous equation for the second coordinate of the system and in which Vo:n is not resimulated. 

Lemma 1. Under assumption ( SMC3 ), for any e > 0, and for any bounded Borel function f on M, there 
exist constants Ci and C2, independent of 6, such that 



(8) 
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where ||/|| is the sup-norm of f. 



Proof. We omit 6 in the proof for clarity. The conditional expectation 7r„/ can be written 

Il^jUo) ]Ti=lPM, Ui\Vi-i. Ui-l)f{Un)dUo ...dUn 
J KUo)U"=lPA{V^,Ui\Vi-l,U^-l)dUo . . .dUn 

where fj, is the distribution of Uq. We have 



(9) 



f{u)fj,{du). 



Consider for i = 1 , . . . , n, the kernels Hi from M into itself by 

Hif{u)= / pA{Vi,u'\Vi-i,u)f{u)du'. 



(10) 



Then 7r„ can be expressed recursively by 



These kernels are extensions of the kernels considered by 'Del Moral et al. (2001 1. Note that the de- 
nominator of (|9]) is fiHi • • • Hnl = PA{Vo:n), which is different from since it is normal, and bounded 
following from assumption (SMC3). We write i/„ = fj,Hi ■ ■ ■ Hnl for this constant conditioned on the 
observed values Vo:n- Also ( fTO] ) is bounded, i.e. Hi\{u) < C for all n G M and i = 1, . . . , n, for some 
constant C. It directly follows that fiHi ■ ■ ■ Hi^il < C*~^. Furthermore, we obtain the bound 



pHi---Hd > 



fiHi • • • i^j+il 



> . . . > 



c - - C"-* 

Finally, using the above bounds and that 7rj_i is a transition measure, we obtain 

< ^i-iHil<C. 



C 



n-l 



(11) 



Consider the SMC sampled particles in algorithm[T] Define the two empirical measures obtained at time 
i: ^-^ = Ylk=i ^[/'{fc) and = Y.k=i ^il^fiV^/C^)- We also decompose the weights and write 
Tf / = ^ Ef=i fiul^)m{U^). Then Wi{ui,'}) = wdui^l)/{KTfl) and *f / = Tf //Tf 1. 



Recall the following general result ( Del Moral et al.\ |2001[ ) for ^i, . . . , random variables, which 
conditioned on a cr-field Q are independent, centered and bounded < o- Then for any e > we have 



1 ^ 

k=l 



ik 



>e\ < 2exp -K 



2a2 



(12) 



Let / be a bounded function on R. Then under assumption (SMC3) 



k=l 



1 ^ 



k=l 
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fulfills the conditions for ^ to hold with a = 2||/||, since ¥.{f{u[^^^)\g) = ^ff, where g is the 
cT-algebra generated by U^.^ . Thus, for any e > we obtain 



K . 



>e] < 2 exp -K 



(13) 



Likewise, as Hif{u) = Qi{fwi){u) with Qi{f){u) = f q{u'\Vi, u)f{u')du', we have 



which fulfills the conditions for ( [T2| ) to hold, now with a = 2C||/|| and Q is the a-algebra generated by 
C^-i-i- Hence, for any e > we obtain 



We want to show the following two bounds 



>e] < 2 exp -K 



8C2 



{\^ff-Tiif\>e) < 2Iiexp -iC 



e 



>e) < 2/,exp(-i^^ 



8J.II/P 

^2 



i = 0, 1, . . . , n 



(14) 

(15) 
(16) 



by induction on i, for some constants Ii,I^, Ji, increasing with i to be computed later. Note first that 

since ttq = fi and C/q^'^^ are i.i.d. with law fi, then (12i with = fiU^^^) — n{f) yields ([T6]l for i = 
with /• = J =1. Let i> 1 and assume (|T6ll holds for z — 1. We can write 



Tf/ 

7r,_iFa V Tfl 



(vTi^iFil - Tf 1) + (Tf / - 7r,_iF,/) 



Note that Tfl > because the weights Wi are strictly positive. Define Lif = Tff — -Ki^iHif and use 
that |Tf /I < II /II Tfl (because / is bounded) and (|TT]| to see that 

/^n— 1 

|^'f/-vr,/| < (||/|||La| + |L,/|) (17) 



and 

\L^f\ < \Tff-^[^,H,f\ + \^[^,HJ-7T,^,H,f\. 

Assuming that ([16]) holds for z - 1 and using ([14]) and that ||-f^i/|| < C||/|| yield 



(18) 



r{\LJ\>£) < 2exp{-K 



32C72 



+ 2/i„iexp -K 



32jf 



We obtain 



^ff-7Tif\>e) < 



Lil\ > 



2C 



n-l 



+ ^{\Uf\ > 



< 4exp -K 



128C2'' 



2C"-i 
+ 4/;_i exp f -K 



128j;_iC2-||/p 
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Hence, ^ holds with k > 2(1 + and Ji > IGC^" J^_-^/f 2 > since i/„ < C". By ^ 



and ( [T5] ) we then conclude that ([T6]l also holds for i if J^' = 1 + li and J^' = 4Jj. These conditions are 
fulfilled by choosing k = 3'+^ - 3 and Ji = 16'. 

Thus, ^ holds with Ci = 6(3" - 1) and C2 = 8 • 16". This concludes the proof. □ 
D.2 Proof of Theorem [1] 

Proof. To prove the convergence of the SAEM-SMC algorithm, we study the stochastic approximation 
scheme used during the SA step. The stochastic approximation Q can be decomposed into: 

with 

where we denote by vr^ -^S = IEA('S'(Vo:n, f^:n)|^0:n; G) the expectation of the sufficient statistic S 
under the exact distribution pA(f^:n|Vb:n; and by xj/^^™-' 5" the expectation of the sufficient statistic 
S under the empirical measure obtained with the SMC algorithm with K{m) particles and current value 
of parameters 9{sm) at iteration m of the SAEM-SMC algorithm. 



Following Theorem 2 of Delyon et al. (1999 1 on the convergence of the Robbins-Monro scheme, the 



convergence of the SAEM-SMC algorithm is ensured if we prove the following assertions: 

1. The sequence (sm)m>o takes its values in a compact set. 

2. The function V{s) = -£a{0{s)) is such that for all s e S, F{s) = {dsV{s), h{s)) < and such 
that the set V{{s, F{s) = 0}) is of zero measure. 

3. limm^oo YllLi O'f.^e. exists and is finite with probability 1. 

4. limm^-oo ''m = with probabiUty 1. 

Assertion 1 follows from assumption (SMC2) and by construction of Sm in formula ([5]). 

Assertion 2 is proved by Lemma 2 of Delyon et al. (1999]) under assumptions (M1)-(M5) and (SAEM2). 



Assertion 3 is proved similarly as Theorem 5 of Delyon et al. (1999]). By construction of the SMC 



algorithm, the equivalent of assumption (SAEM3) is checked for the expectation taken under the ap- 

K(m) 

proximate empirical measure ^ ^ . Indeed, the assumption of independence of the non-observed vari- 
ables • • • ' ^o"n given ^0) • • • 1 is Verified. As a consequence, for any positive Borel function /, 
jg^(m)j-j^^^m+i)-j|j7^^ _ ^K{m)j:^ Then o^e^ is a martingale, boundcd in L2 under assumptions 
(M5) and (SAEM1)-(SAEM2). 
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To verify assertion 4, we use Lemma[T] Under assumptions (SMC2)-(SMC3) and assertion 1, Lemma[T] 
yields that for any e > 0, there exist two constants Ci, C2, independent of 6, such that 



M 



r„,\ > £) 



771=1 



M 
m=l 



M 



< CiY^ exp ( -K{ 



.T,K(m) 

n,e{s.m) n,0{sm) 

2 



5 



> £ 



m=l 



£ 



[m] 



C2\\S\\ 



Finally, assumptions (SMC1)-(SMC2) imply that there exists a constant C3, independent of 9, such that 

M ^1 

m=l m=l 

which is finite when M goes to cxd. This proves the a.s. convergence of to 0. 

□ 



D.3 Proof of Corollary [T] 



Proof. Theorem 6 of Delyon et al. ( 1999 1 can be extended without difficulty to our algorithm. It proves 
that under assumptions of Theorem [l] and (LOCI), the sequence 6jn converge s to a fixed point o f the 
EM-mapping T(e^) = eisiOm)). Assunip tions (LOC2)-(LOC3), Lemma 3 of jOelyon gf a/.| ( |l999| ) and 
application of Brandiere and Duflo ( 1996 1 imply that the sequence 9m converges with probability 1 to a 
proper maximum of the likelihood. □ 



D.4 Proof of Theorem H 

Proof. The Markov property yields 

\p{V0:n; 9) - P6iVo:n; 9)\ < [ \p{Vo:n, Uo-.n, 9) - psiV^-.n, Uq-.u, 9)\ dUo:n 



< 



llp{V^,Ui\Vi-uUi.r,9) - Hps {Vi,Ui\V^.i,Ui.i;9) 



i=l 



i=l 



dUn._ 



0:n 



/n 
Y,\p{V^,Ui\Vi.l,U,.l■,9)-ps{Vi,U,\Vi.l,Ui.v,i 
i=i 



llp{Vj,U,\Vj-i,Uj-i;9) II ps{Vj,U,\Vj^i,Uj^i;9)dUo:r. 

j=l j=i+l 



Bally and Talay ( |1996a|b| provide that under assumption (HI), there exist constants Ci > 0, C2 > 0, 
C3 > 0, C4 > independent of 9 such that 

\psiV,, Ui\Vi.i, U,^i-9) - p{Vi, Ui\Vi^i, Ui-i;9)\ < 5 Cge-^^IK^-^^^-^^-i'^-i)"' 
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We deduce that for alH = 1, . . . , n, there exists a constant C > independent of 9 such that 

„ i—l 

I \p {Vi, Ui\Vi^i, Ui-v,e) - PS {Vi, Ui\Vi-i, Ui-i; 9)\ HpiVj, Uj\Vj-i, Uj-i; 9) 

n 

X n ^5 {Vj, Uj\Vj-i,Uj-V,9) dUo:n < C6 

j=i+l 

Finally, we get |p(Vb:n; 0) - PsiVo-.n, 9)\ < CnS = Cj^nA. □ 
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